I first loaded required packages for data munging, visualization, and analysis (these are largely Hadley Wickham libraries, plus some Bioconductor tools).
rm(list = ls())
cleanup <- gc(verbose = FALSE)
# Load libraries I'll need here
library(edgeR)
library(limma)
library(biomaRt)
library(ggbiplot)
library(cowplot)
library(sva)
# Load my go-to libraries
library(dplyr)
library(ggplot2)
library(ggthemes)
library(stringr)
library(readr)
library(readxl)
library(reshape2)
# Packages for R markdown stuff
library(knitr)
library(shiny)
Functions for plotting metrics are contained in metric_qc_functions.R.
source("R/metric_qc_functions.R")
This is a function written by Elizabeth Whalen (shared by Michael Mason) that might come in handy with some steps of the analysis. I modified the function slightly, such that library sizes are updated and normalization factors are calculated after filtering genes. I also added the option to input gene annotation information.
# Function to build DGEList object, filter genes by keeping only those having %
# samples with at least N counts, and computes normalization from library sizes
setUpDGEList <- function(countData, geneData = NULL,
filterCount = 1, filterPercentage = 0.1)
{
d <- DGEList(counts = countData, genes = geneData)
# d <- calcNormFactors(d) # moved further down
# Filter all genes that do not have at least 'filterCount' counts per
# million in at least 'filterPercentage' percent of libraries
keepRows <- rowSums(round(cpm(d$counts)) >= filterCount) >=
filterPercentage*ncol(countData)
print(table(keepRows))
curDGE <- d[keepRows, ]
# James: I've added this change so that library sizes and normalization
# factors will always be updated/calculated after filtering genes
# reset library sizes
curDGE$samples$lib.size <- colSums(curDGE$counts)
# calculate normalization factors (effective library size =
# lib.size * norm.factor)
curDGE <- calcNormFactors(curDGE)
return(curDGE)
}
Next, I read counts and metrics data for the project into R, along with sample annotation for project libraries.
# Read CSV file with read counts
countFile <- "data/HMMF2ADXX_combined_counts.csv"
countDat <- read_csv(countFile) # 37991 obs. of 18 variables
# str(countDat)
# Read CSV file with RNAseq/alignment metrics
metricFile <- "data/HMMF2ADXX_combined_metrics.csv"
metricDat <- read_csv(metricFile) # 16 obs. of 71 variables
# str(metricDat)
# Read XLSX file with sample annotation
designFile <- "data/JMD119 Sample Information .xlsx"
designDat <- read_excel(designFile, skip = 1) # 36 obs. of 18 variables
# str(designDat)
I needed to do a bit of cleaning/formatting with variable names (column headers) to make life easier and avoid breaking downstream functions.
# Separate gene counts and gene symbols into separate objects, reformat
# variable names in countDat to only include libID
geneDat <- data_frame(ensemblID = countDat$geneName)
countDat <- countDat %>%
select(-geneName)
names(countDat) <- names(countDat) %>%
str_extract("lib[0-9]+")
# Reformat variable names in metrics data frame
names(metricDat) <- names(metricDat) %>%
str_to_lower() %>% # change variable names to lower case
make.unique(sep = "_") # de-dup variable names
names(metricDat)[1] <- "lib_id" # reformat libID variable name
# Reformat row names in metrics dataframe
metricDat <- metricDat %>%
mutate(lib_id = str_extract(lib_id, "lib[0-9]+"))
# Reformat variable names in design data frame
names(designDat) <- names(designDat) %>%
str_replace_all(" +", "_") %>% # replace spaces with underscores
str_replace_all("#", "num") %>% # replace # with 'num'
str_replace_all("/", "_per_") %>%
str_replace_all("(\\(|\\))", "") %>% # remove parentheses
str_to_lower() %>% # change to lower case
str_replace("(?<=(lib))[a-z]+", "") %>% # replace 'library' with 'lib'
make.unique(sep = "_") # de-dup variable names
# Remove empty rows from design data frame
designDat <- designDat %>%
filter(!is.na(lib_id))
I created a new object to store the salient information about groups in the study I want to compare.
groupDat <- designDat %>%
# extract knockout status (WT or BCAP) and HSC population (long or short'
# term); combine into a single group vector
mutate(koStatus = as.factor(tolower(str_extract(sample_name, "WT|BCAP"))),
hscPop = as.factor(tolower(str_extract(hsc_population,
"Long|Short"))),
group = as.factor(str_c(koStatus, hscPop, sep = "_"))) %>%
select(libID = lib_id,
koStatus, hscPop, group)
For reference, here are the relevant groups in the data (stored in groupDat):
| libID | koStatus | hscPop | group |
|---|---|---|---|
| lib7418 | wt | long | wt_long |
| lib7419 | wt | long | wt_long |
| lib7420 | wt | long | wt_long |
| lib7421 | wt | long | wt_long |
| lib7422 | bcap | long | bcap_long |
| lib7423 | bcap | long | bcap_long |
| lib7424 | bcap | long | bcap_long |
| lib7425 | bcap | long | bcap_long |
| lib7426 | wt | short | wt_short |
| lib7427 | wt | short | wt_short |
| lib7428 | wt | short | wt_short |
| lib7429 | wt | short | wt_short |
| lib7430 | bcap | short | bcap_short |
| lib7431 | bcap | short | bcap_short |
| lib7432 | bcap | short | bcap_short |
| lib7433 | bcap | short | bcap_short |
Next, I looked at a few standard metrics to see whether any libraries should be excluded due to quality reasons.
# Pull out and format the subset of metrics to plot
metricSummary <- metricDat %>%
mutate(percentDuplication = unpaired_read_duplicates /
unpaired_reads_examined) %>%
select(libID = lib_id,
medianCVcoverage = median_cv_coverage,
fastqTotalReads = fastq_total_reads,
percentAligned = mapped_reads_w_dups,
percentDuplication)
Cutoffs are set by default to standard values used in the Bioinformatics Core for three metrics; libraries are considered to have ‘failed’ QC for the following conditions:
I can also adjust slider bars to look at different QC cutoffs (red lines) for the x- and y-axis; dashed lines indicate outlier limits (1.5*IQR).
Percent aligned is simply the number of FASTQ reads for which there is a corresponding alignment in the TopHat BAM file. In other words, percentAligned = # of aligned reads (+ all their duplicate reads, which were removed from the final BAM) / fastqTotalReads.
Percent aligned:
Percentage of aligned reads is well above the 80% cutoff for all libraries, with rates in the mid-90s across the board. lib7422 is outside the nominal outlier range, but still has 91.41% alignment.
Total reads:
While all libraries had well over 10 million reads in the input FASTQ file (after adapter trimming), lib7418 appears to be quite a bit smaller than average the average of 24.27 million reads, with only 13.04 million reads.
Median CV coverage is calculated by Picard by
A high CV of read coverage suggests possible 5’ or 3’ bias, or potentially non-uniform (“bumpy” or “spikey”) coverage of a transcript. If medianCVcoverage is high (> 1), this could indicate a more systemic problem with coverage in the dataset.
Median CV coverage:
All libraries look good (with medianCVcoverage generally close to 0.5) in terms of gene coverage among the top 1000 transcripts.
I plotted the (smoothed) frequency of log-normalized counts in each library, just to get a sense of the distribution.
Without any filtering, there’s a large spike at -1, representing genes with a count of 0, along with a smaller bump between 0 and 1, representing genes with very small counts. Notably, the distribution of counts for lib7418 is shifted dramatically to the left (which makes sense, given the much smaller number of pre-aligned reads).
I used the function defined above to build the DGEList object for the data, which is the input for downstream functions.
# Filter genes with (cpm > 10) in < 10% of samples
dge = setUpDGEList(countData = countDat, geneData = geneDat,
filterCount = 10,
filterPercentage = 0.20)
## keepRows
## FALSE TRUE
## 26239 11752
Keeping only those genes with >= 10 counts per million in at least 20% ( 3.2 samples), we’re left with 11752 genes.
While there are still a small fraction of genes with zero or very few counts, these no longer account for such a large percentage of genes across all libraries.
Correcting for library size, the distribution of counts per million (CPM) is more similar across all libraries, including lib7418.
To verify the effect of the change I made to Elizabeth’s code above, I plotted norm.factors and effective library size (lib.size.eff) under two scenarios:
norm.factors are calculated before genes are filterednorm.factors are calculated after genes are filtered and library sizes are updatedFor the not-too-stringent threshold used to filter genes (CPM > 10 in 20% of samples), the order of operations for calculating norm.factors appears to have minimal impact on effective library sizes.
I used the biomaRt package to add gene symbols (from MGI) corresponding to gene IDs from Ensembl.
# Get MGI gene symbols corresponding to Ensembl Gene IDs
dgeGeneDat <- dge$genes
mart <- useMart(biomart = "ensembl", dataset = "mmusculus_gene_ensembl")
ens2Gene <- getBM(attributes = c("ensembl_gene_id", "mgi_symbol"),
filters = "ensembl_gene_id",
values = dgeGeneDat$ensemblID, mart = mart)
ens2Gene = ens2Gene[match(dgeGeneDat$ensemblID, ens2Gene$ensembl_gene_id), ]
# Insert MGI gene symbols for genes in DGE object gene info
dgeGeneDat <- dgeGeneDat %>%
mutate(mgiSymbol = ens2Gene$mgi_symbol,
mgiSymbol = ifelse(is.na(mgiSymbol), "NA", mgiSymbol))
dge$genes <- dgeGeneDat
I performed PCA (on log-normalized CPM values) with the prcomp function and plotted the first two principal components using the ggbiplot function from the ggbiplot package.
The second principal component (PC2) appears to explain the difference between long- and short-term HSC populations – this is clearly a strong singal in the complete dataset. PC1, on the other hand, is not obviously correlated with any experimental group or any of the metrics I examined above.
Among the first two principal components, there is no clear clustering by KO status.
Looking at some of the other top principal components, PC4 appears to have the strongest (though not perfect) relationship with KO status. When PC4 is plotted against PC2 (related to HSC population), the samples separate reasonably well into the expected groups. PC3 seems to have almost an inverse relationship with PC2.
To try to get a better sense of what PC1 might represent in the data, I plotted several experimental, sequencing, and alignment variables against PC1 values.
While PC1 does appear to show some relationship with KO status, the two extreme libraries on the PC1 axis (lib7418 and lib7427) both also have very low RNA concentrations (ng_per_ul). None of the other variables seem to be particularly associated with PC1.
I also plotted the same set of variables against koStatus.
None of the variables jump out to me as being obviously correlated with koStatus, with the exception of mice age_in_weeks. Because KO mice are all at least 1/2 week older than WT mice, I don’t believe there’s any way to effectively rule out this variable as being a source of expression changes. However, I believe the age difference is small enough that the effects are relatively small.
Despite the slope of some of the fitted lines, none of the RNA-seq quality metrics appear to be particularly skewed by koStatus.
Design 1: expression ~ koStatus
The first, possibly most naive, model I trained was predicting expression as a function of BCAP knockout status (i.e., koStatus).
# Define the design matrix, only including terms corresponding to KO status;
# use voom to calculate transformed expression values
koDesign <- model.matrix(~ koStatus, data = groupDat)
koVoom <- voomWithQualityWeights(dge, design = koDesign,
plot = TRUE)
Using voomWithQualityWeights from the limma package, several libraries are downweighted in the normalized data object. The two libraries with the lowest sample weights are once again those with low ng_per_ul values (and also with extreme PC1 values).
# Fit model for the group design
koFit <- lmFit(koVoom, koDesign)
koFit <- eBayes(koFit)
koResults <- topTable(koFit, number = nrow(dge))
This model produces 10 significant genes:
| ensemblID | mgiSymbol | logFC | AveExpr | t | P.Value | adj.P.Val | B |
|---|---|---|---|---|---|---|---|
| ENSMUSG00000025035 | Arl3 | 1.33 | 5.5 | 8.1 | 0 | 0.003 | 6.4 |
| ENSMUSG00000092250 | Gm20467 | -0.78 | 5.8 | -6.4 | 0 | 0.021 | 4.1 |
| ENSMUSG00000040105 | Ppapdc2 | -1.69 | 5.1 | -6.8 | 0 | 0.016 | 3.9 |
| ENSMUSG00000026104 | Stat1 | -0.45 | 7.8 | -5.7 | 0 | 0.043 | 2.7 |
| ENSMUSG00000025793 | Hgs | 1.07 | 6.6 | 5.6 | 0 | 0.043 | 2.7 |
| ENSMUSG00000038418 | Egr1 | 0.67 | 7.9 | 5.7 | 0 | 0.043 | 2.7 |
| ENSMUSG00000022412 | Mief1 | 0.71 | 6.7 | 5.5 | 0 | 0.046 | 2.5 |
| ENSMUSG00000003545 | Fosb | 1.43 | 5.4 | 5.6 | 0 | 0.043 | 2.4 |
| ENSMUSG00000034189 | Hsdl1 | 0.62 | 7.5 | 5.4 | 0 | 0.050 | 2.2 |
| ENSMUSG00000047181 | Samd14 | 1.18 | 4.8 | 5.6 | 0 | 0.043 | 2.1 |
logFC in this table represents the log fold-change in expression per unit change in koStatus (in other words, the log FC in wild-type relative to KO).
| ensemblID | mgiSymbol | logFC | AveExpr | t | P.Value | adj.P.Val | B |
|---|---|---|---|---|---|---|---|
| ENSMUSG00000025017 | Pik3ap1 | 1.1 | 5.9 | 2.6 | 0.019 | 0.34 | -3.3 |
Based on previous inspection of the aligned data in IGV, the knockout of BCAP does not manifest entirely cleanly on the RNA level. As seen in the plots below, BCAP has at least low counts in long-term HSCs and moderate counts in short-term HSCs, even in libraries from KO mice.
Design 2: expression ~ koStatus + hscPop
Because HSC population (i.e., hscPop: long- vs. short-term) accounts for such a large fraction of variance in the data, it makes sense to control for this variable by including it as a parameter in the model. We lose a degree of freedom in the process, but retain more DoF than if we were to subset the data and focus only on long- or short-term mice.
# Define the design matrix, including terms corresponding to KO status and HSC
# population; use voom to calculate transformed expression values
koHscDesign <- model.matrix(~ koStatus + hscPop, data = groupDat)
koHscVoom <- voomWithQualityWeights(dge, design = koHscDesign,
plot = TRUE)
# Fit model for the group design
koHscFit <- lmFit(koHscVoom, koHscDesign)
koHscFit <- eBayes(koHscFit)
koHscResults <- topTable(koHscFit, coef = 2, number = nrow(dge))
While the model includes parameters for both koStatus and hscPop, we’re most interested in the effect of koStatus on expression. By examining the results for the 2nd model coefficient (1st coefficient is the intercept), we can interpret logFC in the table below as the log FC in expression in WT relative to KO mice, when all other parameters are held constant.
koStatus:
| ensemblID | mgiSymbol | logFC | AveExpr | t | P.Value | adj.P.Val | B |
|---|---|---|---|---|---|---|---|
| ENSMUSG00000025035 | Arl3 | 1.32 | 5.5 | 8.4 | 0 | 0.002 | 7.2 |
| ENSMUSG00000064215 | Ifi27 | -0.64 | 7.3 | -7.3 | 0 | 0.007 | 5.5 |
| ENSMUSG00000040105 | Ppapdc2 | -1.72 | 5.1 | -6.7 | 0 | 0.015 | 4.3 |
| ENSMUSG00000092250 | Gm20467 | -0.77 | 5.8 | -6.5 | 0 | 0.016 | 4.2 |
| ENSMUSG00000026104 | Stat1 | -0.45 | 7.8 | -6.4 | 0 | 0.017 | 3.7 |
| ENSMUSG00000003545 | Fosb | 1.38 | 5.4 | 6.0 | 0 | 0.028 | 3.2 |
| ENSMUSG00000032690 | Oas2 | -1.28 | 5.8 | -5.8 | 0 | 0.028 | 3.0 |
| ENSMUSG00000047181 | Samd14 | 1.22 | 4.8 | 5.9 | 0 | 0.028 | 2.9 |
| ENSMUSG00000022412 | Mief1 | 0.72 | 6.7 | 5.8 | 0 | 0.028 | 2.9 |
| ENSMUSG00000034189 | Hsdl1 | 0.61 | 7.5 | 5.7 | 0 | 0.034 | 2.4 |
| ENSMUSG00000025793 | Hgs | 1.08 | 6.6 | 5.6 | 0 | 0.038 | 2.4 |
| ENSMUSG00000039738 | Slx4 | -0.97 | 5.4 | -5.4 | 0 | 0.038 | 2.2 |
| ENSMUSG00000066877 | Nck2 | 1.21 | 4.8 | 5.4 | 0 | 0.038 | 2.1 |
| ENSMUSG00000038418 | Egr1 | 0.66 | 7.9 | 5.5 | 0 | 0.038 | 1.9 |
Pik3ap1 is still not significant after multiple testing correction:
| ensemblID | mgiSymbol | logFC | AveExpr | t | P.Value | adj.P.Val | B |
|---|---|---|---|---|---|---|---|
| ENSMUSG00000025017 | Pik3ap1 | 0.97 | 5.9 | 4.4 | 0 | 0.087 | 0.14 |
To dig a bit further into PC1 / ng_per_ul, which might be introducing unwanted noise into the model, I identified surrogate variables using the sva package. Surrogate variables represent sources of variance in the data, not accounted for by the primary design variables (i.e., the factors of interest); SVA is commonly used to identify and remove potential confounding variables when there may not be clear “batches” among samples.
SVAs were computed on the normalized data with the model matrix for the ~ koStatus + hscPop design
As shown in the plots below, PC1, SV1, and also the voom sample quality weights all appear to be correlated with RNA concentration (ng_per_ul).
koSva <- sva(koHscVoom$E, koHscVoom$design)
## Number of significant surrogate variables is: 3
## Iteration (out of 5 ):1 2 3 4 5
Design 3: expression ~ koStatus + hscPop + ng_per_ul
To control for RNA concentration, I added the ng_per_ul as a parameter in the model.
# Define the design matrix, including terms corresponding to KO status, HSC
# population, and RNA concentration; use voom to calculate transformed
# expression values
koHscConcDesign <- model.matrix(~ koStatus + hscPop + ng_per_ul,
data = confoundDat)
koHscConcVoom <- voomWithQualityWeights(dge, design = koHscDesign,
plot = TRUE)
Notably, voom still down-weights the libraries with low ng_per_ul, suggesting that the “poor” quality in these samples is not fully explained by the RNA concentration variable.
# Fit model for the group design
koHscConcFit <- lmFit(koHscConcVoom, koHscConcDesign)
koHscConcFit <- eBayes(koHscConcFit)
koHscConcResults <- topTable(koHscConcFit, coef = 2, number = nrow(dge))
This new model produces 48 significant genes for koStatus:
| ensemblID | mgiSymbol | logFC | AveExpr | t | P.Value | adj.P.Val | B |
|---|---|---|---|---|---|---|---|
| ENSMUSG00000025035 | Arl3 | 1.39 | 5.5 | 9.3 | 0 | 0.001 | 7.62 |
| ENSMUSG00000029287 | Tgfbr3 | 0.69 | 7.0 | 7.5 | 0 | 0.008 | 5.62 |
| ENSMUSG00000064215 | Ifi27 | -0.65 | 7.3 | -7.0 | 0 | 0.011 | 4.73 |
| ENSMUSG00000032690 | Oas2 | -1.37 | 5.8 | -6.9 | 0 | 0.011 | 4.58 |
| ENSMUSG00000034189 | Hsdl1 | 0.66 | 7.5 | 6.7 | 0 | 0.012 | 4.24 |
| ENSMUSG00000002413 | Braf | 0.80 | 7.3 | 6.4 | 0 | 0.016 | 3.80 |
| ENSMUSG00000026104 | Stat1 | -0.45 | 7.8 | -6.2 | 0 | 0.018 | 3.27 |
| ENSMUSG00000040105 | Ppapdc2 | -1.69 | 5.1 | -6.4 | 0 | 0.016 | 3.25 |
| ENSMUSG00000092250 | Gm20467 | -0.75 | 5.8 | -6.1 | 0 | 0.018 | 3.25 |
| ENSMUSG00000022412 | Mief1 | 0.75 | 6.7 | 6.0 | 0 | 0.018 | 3.15 |
| ENSMUSG00000003545 | Fosb | 1.43 | 5.4 | 6.1 | 0 | 0.018 | 3.06 |
| ENSMUSG00000025212 | Sfxn3 | 0.73 | 6.9 | 5.8 | 0 | 0.024 | 2.63 |
| ENSMUSG00000066877 | Nck2 | 1.29 | 4.8 | 6.0 | 0 | 0.020 | 2.61 |
| ENSMUSG00000047181 | Samd14 | 1.26 | 4.8 | 5.9 | 0 | 0.022 | 2.46 |
| ENSMUSG00000022377 | Asap1 | -0.51 | 7.4 | -5.7 | 0 | 0.025 | 2.40 |
| ENSMUSG00000060126 | Tpt1 | 0.95 | 5.8 | 5.5 | 0 | 0.026 | 2.23 |
| ENSMUSG00000030870 | Ubfd1 | 0.74 | 7.4 | 5.6 | 0 | 0.026 | 2.21 |
| ENSMUSG00000016496 | Cd274 | 0.59 | 7.5 | 5.5 | 0 | 0.026 | 2.06 |
| ENSMUSG00000036478 | Btg1 | 0.88 | 6.4 | 5.4 | 0 | 0.031 | 2.03 |
| ENSMUSG00000033126 | Ybey | 1.40 | 4.5 | 5.7 | 0 | 0.025 | 1.94 |
| ENSMUSG00000001280 | Sp1 | 0.46 | 8.2 | 5.5 | 0 | 0.026 | 1.91 |
| ENSMUSG00000034487 | Kdelc2 | 0.98 | 6.0 | 5.3 | 0 | 0.035 | 1.79 |
| ENSMUSG00000000861 | Bcl11a | 0.50 | 8.9 | 5.5 | 0 | 0.026 | 1.78 |
| ENSMUSG00000025793 | Hgs | 1.02 | 6.6 | 5.3 | 0 | 0.035 | 1.70 |
| ENSMUSG00000009905 | Kdsr | 0.53 | 7.2 | 5.3 | 0 | 0.035 | 1.58 |
| ENSMUSG00000024750 | Zfand5 | 0.43 | 8.4 | 5.3 | 0 | 0.035 | 1.42 |
| ENSMUSG00000039738 | Slx4 | -0.92 | 5.4 | -5.1 | 0 | 0.044 | 1.39 |
| ENSMUSG00000038418 | Egr1 | 0.66 | 7.9 | 5.2 | 0 | 0.036 | 1.36 |
| ENSMUSG00000021704 | Mtx3 | 0.64 | 6.4 | 5.0 | 0 | 0.046 | 1.27 |
| ENSMUSG00000022565 | Plec | 0.81 | 7.6 | 5.1 | 0 | 0.042 | 1.24 |
| ENSMUSG00000052776 | Oas1a | -0.67 | 5.3 | -5.0 | 0 | 0.047 | 1.23 |
| ENSMUSG00000042198 | Chchd7 | 0.60 | 6.6 | 5.0 | 0 | 0.046 | 1.21 |
| ENSMUSG00000069114 | Zbtb10 | 0.93 | 5.8 | 4.9 | 0 | 0.047 | 1.13 |
| ENSMUSG00000021098 | 4930447C04Rik | 0.84 | 5.4 | 4.9 | 0 | 0.047 | 1.06 |
| ENSMUSG00000047632 | Fgfbp3 | 0.90 | 5.1 | 4.9 | 0 | 0.047 | 1.04 |
| ENSMUSG00000050229 | Pigm | 0.56 | 6.5 | 4.9 | 0 | 0.047 | 1.01 |
| ENSMUSG00000000682 | Cd52 | -0.85 | 4.5 | -4.9 | 0 | 0.047 | 0.98 |
| ENSMUSG00000072244 | Trim6 | 0.92 | 6.6 | 4.9 | 0 | 0.047 | 0.93 |
| ENSMUSG00000048249 | Crebrf | 0.39 | 8.0 | 5.0 | 0 | 0.046 | 0.91 |
| ENSMUSG00000022263 | Trio | 0.85 | 5.6 | 4.8 | 0 | 0.048 | 0.90 |
| ENSMUSG00000015488 | Cacfd1 | 1.06 | 5.3 | 4.8 | 0 | 0.048 | 0.89 |
| ENSMUSG00000091219 | NA | 1.27 | 5.3 | 4.8 | 0 | 0.048 | 0.88 |
| ENSMUSG00000025134 | Alyref | 0.59 | 7.3 | 4.9 | 0 | 0.047 | 0.86 |
| ENSMUSG00000040483 | Xaf1 | -0.65 | 6.5 | -4.8 | 0 | 0.048 | 0.83 |
| ENSMUSG00000075602 | Ly6a | -0.47 | 6.8 | -4.8 | 0 | 0.048 | 0.82 |
| ENSMUSG00000000184 | Ccnd2 | 0.33 | 8.9 | 4.9 | 0 | 0.047 | 0.57 |
| ENSMUSG00000011267 | Zfp296 | -1.61 | 4.2 | -4.9 | 0 | 0.047 | 0.54 |
| ENSMUSG00000025213 | Kazald1 | 5.11 | 1.5 | 5.2 | 0 | 0.040 | -2.42 |
Pik3ap1 is still not significant after multiple testing correction: kable(koHscConcResults %>% filter(mgiSymbol == "Pik3ap1"), digits = 3, format = "html")
Design 3.5: expression ~ koStatus + hscPop + ng_per_ul, no quality weighting of samples
Because I’ve accounted for the most obvious source of the variance explained by PC1 - and because ng_per_ul also appears to be correlated with the sample quality weights from voom - I opted to train an additional model without the quality weighting.
# Define the design matrix, including terms corresponding to KO status, HSC
# population, and RNA concentration; use voom to calculate transformed
# expression values
koHscConcVoomNoQual <- voom(dge, design = koHscDesign, plot = TRUE)
# Fit model for the group design
koHscConcNoQualFit <- lmFit(koHscConcVoomNoQual, koHscConcDesign)
koHscConcNoQualFit <- eBayes(koHscConcNoQualFit)
koHscConcNoQualResults <- topTable(koHscConcNoQualFit, coef = 2, number = nrow(dge))
This new model produces 132 significant genes for koStatus:
| ensemblID | mgiSymbol | logFC | AveExpr | t | P.Value | adj.P.Val | B |
|---|---|---|---|---|---|---|---|
| ENSMUSG00000025035 | Arl3 | 1.37 | 5.5 | 8.8 | 0.000 | 0.002 | 6.965 |
| ENSMUSG00000029287 | Tgfbr3 | 0.72 | 7.0 | 6.9 | 0.000 | 0.012 | 4.778 |
| ENSMUSG00000034189 | Hsdl1 | 0.79 | 7.5 | 6.7 | 0.000 | 0.012 | 4.440 |
| ENSMUSG00000064215 | Ifi27 | -0.72 | 7.3 | -6.7 | 0.000 | 0.012 | 4.364 |
| ENSMUSG00000025212 | Sfxn3 | 0.81 | 6.9 | 6.6 | 0.000 | 0.012 | 4.219 |
| ENSMUSG00000002413 | Braf | 0.89 | 7.3 | 6.5 | 0.000 | 0.012 | 4.128 |
| ENSMUSG00000036478 | Btg1 | 1.03 | 6.4 | 6.4 | 0.000 | 0.012 | 3.901 |
| ENSMUSG00000060126 | Tpt1 | 1.13 | 5.8 | 6.4 | 0.000 | 0.012 | 3.845 |
| ENSMUSG00000030870 | Ubfd1 | 0.83 | 7.4 | 6.2 | 0.000 | 0.014 | 3.463 |
| ENSMUSG00000001280 | Sp1 | 0.57 | 8.2 | 6.1 | 0.000 | 0.014 | 3.292 |
| ENSMUSG00000022412 | Mief1 | 0.83 | 6.7 | 6.0 | 0.000 | 0.016 | 3.190 |
| ENSMUSG00000040105 | Ppapdc2 | -1.60 | 5.1 | -6.2 | 0.000 | 0.014 | 3.025 |
| ENSMUSG00000092250 | Gm20467 | -0.81 | 5.8 | -5.9 | 0.000 | 0.017 | 3.005 |
| ENSMUSG00000047181 | Samd14 | 1.22 | 4.8 | 5.9 | 0.000 | 0.017 | 2.529 |
| ENSMUSG00000025793 | Hgs | 1.13 | 6.6 | 5.6 | 0.000 | 0.026 | 2.486 |
| ENSMUSG00000016496 | Cd274 | 0.67 | 7.5 | 5.6 | 0.000 | 0.026 | 2.431 |
| ENSMUSG00000032690 | Oas2 | -1.38 | 5.8 | -5.5 | 0.000 | 0.027 | 2.319 |
| ENSMUSG00000025134 | Alyref | 0.70 | 7.3 | 5.5 | 0.000 | 0.027 | 2.173 |
| ENSMUSG00000038872 | Zfhx3 | 0.98 | 6.8 | 5.4 | 0.000 | 0.027 | 2.099 |
| ENSMUSG00000066877 | Nck2 | 1.26 | 4.8 | 5.6 | 0.000 | 0.026 | 2.064 |
| ENSMUSG00000029050 | Ski | 0.77 | 7.1 | 5.4 | 0.000 | 0.027 | 1.953 |
| ENSMUSG00000038418 | Egr1 | 0.68 | 7.9 | 5.4 | 0.000 | 0.027 | 1.938 |
| ENSMUSG00000034487 | Kdelc2 | 1.04 | 6.0 | 5.3 | 0.000 | 0.027 | 1.928 |
| ENSMUSG00000003545 | Fosb | 1.43 | 5.4 | 5.4 | 0.000 | 0.027 | 1.924 |
| ENSMUSG00000029821 | Dfna5 | 0.65 | 7.3 | 5.4 | 0.000 | 0.027 | 1.919 |
| ENSMUSG00000042198 | Chchd7 | 0.68 | 6.6 | 5.3 | 0.000 | 0.027 | 1.900 |
| ENSMUSG00000022565 | Plec | 0.91 | 7.6 | 5.4 | 0.000 | 0.027 | 1.869 |
| ENSMUSG00000022637 | Cblb | 1.10 | 7.3 | 5.3 | 0.000 | 0.029 | 1.707 |
| ENSMUSG00000024750 | Zfand5 | 0.51 | 8.4 | 5.3 | 0.000 | 0.027 | 1.654 |
| ENSMUSG00000072244 | Trim6 | 0.97 | 6.6 | 5.2 | 0.000 | 0.032 | 1.585 |
| ENSMUSG00000041815 | Poldip3 | 0.74 | 7.7 | 5.2 | 0.000 | 0.030 | 1.561 |
| ENSMUSG00000009905 | Kdsr | 0.61 | 7.2 | 5.1 | 0.000 | 0.034 | 1.398 |
| ENSMUSG00000000184 | Ccnd2 | 0.39 | 8.9 | 5.2 | 0.000 | 0.030 | 1.330 |
| ENSMUSG00000021098 | 4930447C04Rik | 0.95 | 5.4 | 5.0 | 0.000 | 0.037 | 1.272 |
| ENSMUSG00000069662 | Marcks | 0.99 | 7.0 | 5.0 | 0.000 | 0.037 | 1.257 |
| ENSMUSG00000075602 | Ly6a | -0.53 | 6.8 | -5.0 | 0.000 | 0.037 | 1.242 |
| ENSMUSG00000033126 | Ybey | 1.31 | 4.5 | 5.2 | 0.000 | 0.030 | 1.222 |
| ENSMUSG00000023026 | Dip2b | 0.77 | 7.8 | 5.0 | 0.000 | 0.037 | 1.167 |
| ENSMUSG00000087370 | Tmem170b | 0.77 | 7.2 | 5.0 | 0.000 | 0.037 | 1.151 |
| ENSMUSG00000027405 | Nop56 | -0.51 | 7.5 | -5.0 | 0.000 | 0.037 | 1.142 |
| ENSMUSG00000047632 | Fgfbp3 | 1.00 | 5.1 | 4.9 | 0.000 | 0.037 | 1.095 |
| ENSMUSG00000039738 | Slx4 | -0.92 | 5.4 | -4.9 | 0.000 | 0.038 | 1.073 |
| ENSMUSG00000021712 | Trim23 | 0.81 | 7.8 | 5.0 | 0.000 | 0.037 | 1.061 |
| ENSMUSG00000000682 | Cd52 | -0.97 | 4.5 | -4.9 | 0.000 | 0.037 | 1.017 |
| ENSMUSG00000034158 | Lrrc58 | 0.71 | 7.8 | 5.0 | 0.000 | 0.037 | 1.016 |
| ENSMUSG00000091953 | NA | 1.03 | 5.1 | 4.9 | 0.000 | 0.038 | 0.981 |
| ENSMUSG00000050229 | Pigm | 0.60 | 6.5 | 4.8 | 0.000 | 0.043 | 0.863 |
| ENSMUSG00000026434 | Nucks1 | 0.58 | 8.6 | 4.9 | 0.000 | 0.037 | 0.829 |
| ENSMUSG00000048154 | Kmt2d | 1.36 | 5.6 | 4.7 | 0.000 | 0.044 | 0.805 |
| ENSMUSG00000038170 | Pde4dip | 0.96 | 6.4 | 4.7 | 0.000 | 0.044 | 0.795 |
| ENSMUSG00000044786 | Zfp36 | 0.40 | 7.1 | 4.8 | 0.000 | 0.043 | 0.784 |
| ENSMUSG00000026064 | Ptp4a1 | 1.05 | 6.2 | 4.7 | 0.000 | 0.044 | 0.774 |
| ENSMUSG00000037336 | Mfsd2b | -1.65 | 5.4 | -4.7 | 0.000 | 0.044 | 0.764 |
| ENSMUSG00000059900 | Tmem40 | -0.59 | 6.9 | -4.8 | 0.000 | 0.044 | 0.757 |
| ENSMUSG00000053094 | Tmem248 | 0.67 | 7.3 | 4.8 | 0.000 | 0.043 | 0.746 |
| ENSMUSG00000054408 | Spcs3 | 0.86 | 8.5 | 4.9 | 0.000 | 0.038 | 0.731 |
| ENSMUSG00000029438 | Bcl7a | 1.14 | 6.6 | 4.7 | 0.000 | 0.044 | 0.699 |
| ENSMUSG00000034928 | Rnf44 | 0.63 | 6.4 | 4.7 | 0.000 | 0.044 | 0.668 |
| ENSMUSG00000026113 | Inpp4a | 1.02 | 5.2 | 4.7 | 0.000 | 0.044 | 0.647 |
| ENSMUSG00000008958 | Vps72 | 0.79 | 6.7 | 4.7 | 0.000 | 0.044 | 0.593 |
| ENSMUSG00000043505 | Gimap5 | 0.50 | 7.4 | 4.7 | 0.000 | 0.044 | 0.562 |
| ENSMUSG00000020160 | Meis1 | 0.61 | 9.5 | 4.9 | 0.000 | 0.038 | 0.537 |
| ENSMUSG00000092060 | Bend4 | 1.04 | 5.6 | 4.6 | 0.000 | 0.047 | 0.528 |
| ENSMUSG00000070576 | Mn1 | 1.05 | 5.2 | 4.6 | 0.000 | 0.047 | 0.511 |
| ENSMUSG00000026696 | Vamp4 | 0.82 | 6.7 | 4.6 | 0.000 | 0.046 | 0.468 |
| ENSMUSG00000062376 | 2010012O05Rik | 0.74 | 6.9 | 4.6 | 0.000 | 0.046 | 0.444 |
| ENSMUSG00000026104 | Stat1 | -0.42 | 7.8 | -4.7 | 0.000 | 0.044 | 0.391 |
| ENSMUSG00000033545 | Znrf1 | 0.46 | 7.3 | 4.6 | 0.000 | 0.046 | 0.375 |
| ENSMUSG00000032579 | Hemk1 | 0.77 | 5.4 | 4.5 | 0.000 | 0.049 | 0.362 |
| ENSMUSG00000008730 | Hipk1 | 0.55 | 8.6 | 4.7 | 0.000 | 0.044 | 0.359 |
| ENSMUSG00000028152 | Tspan5 | 1.04 | 5.9 | 4.5 | 0.000 | 0.049 | 0.319 |
| ENSMUSG00000052776 | Oas1a | -0.75 | 5.3 | -4.5 | 0.000 | 0.049 | 0.278 |
| ENSMUSG00000013593 | Ndufs2 | -0.51 | 7.1 | -4.5 | 0.000 | 0.048 | 0.275 |
| ENSMUSG00000039126 | Prune2 | -2.07 | 4.7 | -4.6 | 0.000 | 0.045 | 0.275 |
| ENSMUSG00000069769 | Msi2 | 0.75 | 8.8 | 4.7 | 0.000 | 0.044 | 0.273 |
| ENSMUSG00000029647 | Pan3 | 0.52 | 8.5 | 4.7 | 0.000 | 0.044 | 0.253 |
| ENSMUSG00000021962 | Dcp1a | -0.99 | 5.5 | -4.4 | 0.000 | 0.049 | 0.236 |
| ENSMUSG00000003131 | Pafah1b2 | 0.50 | 8.2 | 4.6 | 0.000 | 0.046 | 0.229 |
| ENSMUSG00000009470 | Tnpo1 | 0.54 | 7.8 | 4.6 | 0.000 | 0.047 | 0.195 |
| ENSMUSG00000025939 | Ube2w | 0.44 | 6.9 | 4.5 | 0.000 | 0.049 | 0.165 |
| ENSMUSG00000031627 | Irf2 | 0.46 | 7.1 | 4.5 | 0.000 | 0.049 | 0.113 |
| ENSMUSG00000022263 | Trio | 0.84 | 5.6 | 4.4 | 0.000 | 0.049 | 0.098 |
| ENSMUSG00000026782 | Abi2 | 0.80 | 8.0 | 4.5 | 0.000 | 0.049 | 0.087 |
| ENSMUSG00000040433 | Zbtb38 | 0.47 | 8.3 | 4.5 | 0.000 | 0.048 | 0.085 |
| ENSMUSG00000026276 | Sept2 | 0.62 | 7.9 | 4.5 | 0.000 | 0.049 | 0.085 |
| ENSMUSG00000071072 | Ptges3 | 0.83 | 6.1 | 4.4 | 0.000 | 0.049 | 0.075 |
| ENSMUSG00000023147 | Wrb | 0.84 | 5.7 | 4.3 | 0.000 | 0.049 | 0.056 |
| ENSMUSG00000005533 | Igf1r | 1.48 | 5.8 | 4.3 | 0.000 | 0.049 | 0.049 |
| ENSMUSG00000022952 | Runx1 | 0.45 | 8.2 | 4.5 | 0.000 | 0.049 | 0.046 |
| ENSMUSG00000007670 | Khsrp | 1.35 | 5.4 | 4.3 | 0.000 | 0.049 | 0.035 |
| ENSMUSG00000079419 | Ms4a6c | -0.84 | 5.8 | -4.3 | 0.001 | 0.049 | 0.022 |
| ENSMUSG00000069114 | Zbtb10 | 1.08 | 5.8 | 4.3 | 0.001 | 0.049 | 0.013 |
| ENSMUSG00000025017 | Pik3ap1 | 1.01 | 5.9 | 4.3 | 0.000 | 0.049 | 0.001 |
| ENSMUSG00000034300 | Fam53c | 1.08 | 6.6 | 4.4 | 0.000 | 0.049 | -0.002 |
| ENSMUSG00000029322 | Plac8 | -0.83 | 6.2 | -4.3 | 0.001 | 0.049 | -0.004 |
| ENSMUSG00000032035 | Ets1 | 1.29 | 5.9 | 4.3 | 0.001 | 0.049 | -0.005 |
| ENSMUSG00000020176 | Grb10 | 0.54 | 6.8 | 4.4 | 0.000 | 0.049 | -0.010 |
| ENSMUSG00000091219 | NA | 1.24 | 5.3 | 4.3 | 0.001 | 0.049 | -0.011 |
| ENSMUSG00000022377 | Asap1 | -0.47 | 7.4 | -4.4 | 0.000 | 0.049 | -0.015 |
| ENSMUSG00000081594 | Gm15467 | 1.00 | 4.9 | 4.3 | 0.001 | 0.049 | -0.039 |
| ENSMUSG00000062078 | Qk | 0.63 | 8.3 | 4.5 | 0.000 | 0.049 | -0.042 |
| ENSMUSG00000033713 | Foxn3 | 0.44 | 7.4 | 4.4 | 0.000 | 0.049 | -0.048 |
| ENSMUSG00000033499 | Larp4b | 0.61 | 7.4 | 4.4 | 0.000 | 0.049 | -0.057 |
| ENSMUSG00000038212 | Hiatl1 | 0.50 | 7.2 | 4.4 | 0.000 | 0.049 | -0.058 |
| ENSMUSG00000052713 | Zfp608 | 0.30 | 8.2 | 4.5 | 0.000 | 0.049 | -0.060 |
| ENSMUSG00000048249 | Crebrf | 0.38 | 8.0 | 4.4 | 0.000 | 0.049 | -0.094 |
| ENSMUSG00000075376 | Rc3h2 | 0.74 | 8.5 | 4.5 | 0.000 | 0.049 | -0.117 |
| ENSMUSG00000028521 | Slc35d1 | 0.78 | 7.5 | 4.4 | 0.000 | 0.049 | -0.123 |
| ENSMUSG00000063410 | Stk24 | 0.62 | 6.3 | 4.3 | 0.001 | 0.050 | -0.132 |
| ENSMUSG00000060510 | Zfp266 | 0.73 | 7.8 | 4.4 | 0.000 | 0.049 | -0.143 |
| ENSMUSG00000021488 | Nsd1 | 0.55 | 8.6 | 4.5 | 0.000 | 0.049 | -0.154 |
| ENSMUSG00000062646 | Ganc | 0.64 | 7.0 | 4.3 | 0.000 | 0.049 | -0.154 |
| ENSMUSG00000001794 | Capns1 | 0.45 | 7.7 | 4.4 | 0.000 | 0.049 | -0.160 |
| ENSMUSG00000045362 | Tnfrsf26 | 0.84 | 7.1 | 4.3 | 0.001 | 0.049 | -0.178 |
| ENSMUSG00000066551 | Hmgb1 | 0.55 | 7.1 | 4.3 | 0.001 | 0.049 | -0.246 |
| ENSMUSG00000021134 | Srsf5 | 0.50 | 8.3 | 4.4 | 0.000 | 0.049 | -0.247 |
| ENSMUSG00000056999 | Ide | 0.59 | 7.6 | 4.3 | 0.000 | 0.049 | -0.263 |
| ENSMUSG00000062822 | 4833420G17Rik | 0.39 | 7.7 | 4.3 | 0.000 | 0.049 | -0.276 |
| ENSMUSG00000020962 | Gtf2a1 | 0.81 | 7.2 | 4.3 | 0.001 | 0.049 | -0.288 |
| ENSMUSG00000048379 | Socs4 | 0.56 | 7.8 | 4.3 | 0.000 | 0.049 | -0.295 |
| ENSMUSG00000004880 | Lbr | 0.52 | 8.1 | 4.3 | 0.000 | 0.049 | -0.310 |
| ENSMUSG00000000861 | Bcl11a | 0.51 | 8.9 | 4.4 | 0.000 | 0.049 | -0.368 |
| ENSMUSG00000020015 | Cdk17 | 0.45 | 8.6 | 4.4 | 0.000 | 0.049 | -0.371 |
| ENSMUSG00000003949 | Hlf | 0.38 | 9.2 | 4.4 | 0.000 | 0.049 | -0.388 |
| ENSMUSG00000025935 | Tram1 | 0.44 | 8.4 | 4.3 | 0.001 | 0.049 | -0.432 |
| ENSMUSG00000069056 | NA | 1.65 | 4.0 | 4.3 | 0.001 | 0.049 | -0.517 |
| ENSMUSG00000037965 | Zc3h7a | 0.41 | 8.5 | 4.3 | 0.001 | 0.049 | -0.520 |
| ENSMUSG00000026034 | Clk1 | 0.53 | 9.1 | 4.3 | 0.001 | 0.049 | -0.623 |
| ENSMUSG00000032372 | Plscr2 | -4.38 | 2.6 | -4.5 | 0.000 | 0.049 | -2.125 |
| ENSMUSG00000001943 | Vsig2 | -2.67 | 1.9 | -4.4 | 0.000 | 0.049 | -2.213 |
| ENSMUSG00000025213 | Kazald1 | 4.78 | 1.5 | 4.8 | 0.000 | 0.044 | -2.510 |
| ENSMUSG00000064202 | 4430402I18Rik | -3.62 | 1.4 | -4.5 | 0.000 | 0.049 | -2.518 |
Pik3ap1 is borderline significant after multiple testing correction:
| ensemblID | mgiSymbol | logFC | AveExpr | t | P.Value | adj.P.Val | B |
|---|---|---|---|---|---|---|---|
| ENSMUSG00000025017 | Pik3ap1 | 1 | 5.9 | 4.3 | 0 | 0.049 | 0.001 |
While, for the reasons stated above, differential BCAP expression should probably not be viewed as a positive control, it’s at least somewhat encouraging to see the model better account for this difference.
The plot below shows the full list of genes that were found to be significant with one or more of the above model designs. For each gene, a colored bar indicates the design under which it was found to be significant (the size of the bar is scaled by the adjusted p-value).
The gene lists for designs 1 and 2 overlap completely with designs 3 and 3.5; there were 4 genes from design 3 that were not significant in design 3.5.
Genes with significantly different expression (adj. p-value < 0.05) as a function of BCAP KO status:
Just to check, I also tested for differential expression with only libraries from the same HSC population included. For both long- and short-term HSCs, there were no significant genes for koStatus.
# Get data for libraries from long-term HSC population
groupDatLong <- groupDat %>%
filter(hscPop == "long")
countDatLong <- countDat[, names(countDat) %in% groupDatLong$libID]
# Filter genes with (cpm > 10) in < 10% of samples
dgeLong = setUpDGEList(countData = countDatLong, geneData = geneDat,
filterCount = 10,
filterPercentage = 0.20)
## keepRows
## FALSE TRUE
## 26152 11839
Design: expression ~ koStatus
# Define the design matrix, only including terms corresponding to KO status;
# use voom to calculate transformed expression values
koDesignLong <- model.matrix(~ koStatus, data = groupDatLong)
koVoomLong <- voomWithQualityWeights(dgeLong, design = koDesignLong,
plot = TRUE)
# Fit model for the group design
koFitLong <- lmFit(koVoomLong, koDesignLong)
koFitLong <- eBayes(koFitLong)
koResultsLong <- topTable(koFitLong, number = nrow(dgeLong))
This model produces 0 significant genes for koStatus.
# Get data for libraries from long-term HSC population
groupDatShort <- groupDat %>%
filter(hscPop == "short")
countDatShort <- countDat[, names(countDat) %in% groupDatShort$libID]
# Filter genes with (cpm > 10) in < 10% of samples
dgeShort = setUpDGEList(countData = countDatShort, geneData = geneDat,
filterCount = 10,
filterPercentage = 0.20)
## keepRows
## FALSE TRUE
## 26097 11894
Design: expression ~ koStatus
# Define the design matrix, only including terms corresponding to KO status;
# use voom to calculate transformed expression values
koDesignShort <- model.matrix(~ koStatus, data = groupDatShort)
koVoomShort <- voomWithQualityWeights(dgeShort, design = koDesignShort,
plot = TRUE)
# Fit model for the group design
koFitShort <- lmFit(koVoomShort, koDesignShort)
koFitShort <- eBayes(koFitShort)
koResultsShort <- topTable(koFitShort, number = nrow(dgeShort))
This model produces 0 significant genes for koStatus.
sessionInfo()
## R version 3.2.1 (2015-06-18)
## Platform: x86_64-apple-darwin13.4.0 (64-bit)
## Running under: OS X 10.10.3 (Yosemite)
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] grid stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] ggfortify_0.0.3 proto_0.3-10 shiny_0.12.2
## [4] knitr_1.11 reshape2_1.4.1 readxl_0.1.0.9000
## [7] readr_0.1.1 stringr_1.0.0 ggthemes_2.2.1
## [10] dplyr_0.4.2 sva_3.14.0 genefilter_1.50.0
## [13] mgcv_1.8-7 nlme_3.1-121 cowplot_0.5.0
## [16] ggbiplot_0.55 scales_0.2.5 plyr_1.8.3
## [19] ggplot2_1.0.1 biomaRt_2.24.0 edgeR_3.10.2
## [22] limma_3.24.15
##
## loaded via a namespace (and not attached):
## [1] splines_3.2.1 lattice_0.20-33 colorspace_1.2-6
## [4] htmltools_0.2.6 stats4_3.2.1 yaml_2.1.13
## [7] XML_3.98-1.3 survival_2.38-3 DBI_0.3.1
## [10] BiocGenerics_0.14.0 munsell_0.4.2 gtable_0.1.2
## [13] codetools_0.2-14 evaluate_0.7.2 labeling_0.3
## [16] Biobase_2.28.0 IRanges_2.2.7 httpuv_1.3.3
## [19] GenomeInfoDb_1.4.2 parallel_3.2.1 AnnotationDbi_1.30.1
## [22] highr_0.5 Rcpp_0.12.0 xtable_1.7-4
## [25] formatR_1.2 S4Vectors_0.6.3 annotate_1.46.1
## [28] mime_0.3 gridExtra_2.0.0 digest_0.6.8
## [31] stringi_0.5-5 tools_3.2.1 bitops_1.0-6
## [34] magrittr_1.5 RCurl_1.95-4.7 lazyeval_0.1.10
## [37] RSQLite_1.0.0 tidyr_0.2.0 MASS_7.3-43
## [40] Matrix_1.2-2 assertthat_0.1 rmarkdown_0.7
## [43] R6_2.1.1